ABSTRACT
1 Introduction.

The move-to-front (MTF) rule is an algorithm for a self-organizing linear list of a finite number of
items, say, $\{1, 2, \dots, N\}$. The list is updated in the following way. At each discrete unit of time,
an item is requested, according to request probability $p_i > 0$, $i = 1, \dots, N$. If the item is found at
the $k$th position, it is moved to the top position and the items in the first to the $(k-1)$th positions
are moved down by one position. Successive requests are independent. This algorithm defines a
Markov chain on the state space of the permutations of $\{1, 2, \dots, N\}$. There have been extensive
studies on the MTF model, dating back to [26, 20, 15].
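The update rule just described can be sketched in a few lines of Python. This is only an illustration: the function name `move_to_front` and the representation of the list as a Python list ordered from top to bottom are assumptions made here, not notation from the literature.

```python
def move_to_front(order, item):
    """Serve one request under the MTF rule.

    `order` lists the items from top to bottom and is updated in place.
    Returns the position (1-based) at which `item` was found before the move.
    """
    k = order.index(item)           # 0-based position of the requested item
    order.insert(0, order.pop(k))   # move it to the top; the items above it shift down by one
    return k + 1


order = [1, 2, 3, 4]
cost = move_to_front(order, 3)      # item 3 is found at the 3rd position
```

After the call, `order` is `[3, 1, 2, 4]`: the two items that were above item 3 have each moved down by one position, and the rest of the list is unchanged.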

In [12, 13, 14] we studied a continuous-time Markov process which we called the stochastic
ranking process. The process corresponds to a Poisson embedding of the MTF chain into continuous
time [10, 3]. Each item jumps to the top with jump rate $w_i$ per unit time (corresponding to
$p_i$ in the discrete-time model) independently of the others.

Near the top of the list, popular (often-jumped or often-requested) items tend to gather, but
there are always unpopular items mixed with popular ones. As a mathematically precise formulation
of this observation, we proved in [12] that, under appropriate conditions such as the existence
of the limit jump-rate distribution $\lambda$ as $N \to \infty$, the joint distribution $\mu_t^{(N)}$ of the jump rate
(popularity) and the scaled position on the list converges as $N \to \infty$, and also gave an explicit
formula for the limit distribution $\mu_t$. We also obtained the expression for the boundary on the
list between items that have jumped at least once and those that have not. Under an appropriate
scaling, the boundary converges to a deterministic trajectory $y = y_C(t)$ as $N \to \infty$. $y_C(t)$ is given
by the Laplace transform of the limit jump-rate distribution $\lambda$:

\[ y_C(t) = 1 - \int_0^\infty e^{-wt}\, \lambda(dw). \]

$\mu_t$ mentioned above has a general expression in terms of the inverse function $t_0(y)$ of $y_C(t)$ and its
likes (see (22) or (23) in Section 2).

After [12, 13] were accepted for publication, we learned that the MTF rule has been in the
literature for nearly half a century [26, 20, 15, 18], and has also been called self-organizing search,
Tsetlin library [23], or more recently, least-recently-used (LRU) caching [17, 25]. In spite of the long
history of studies of the rule, the main results in [12, 13], which we summarize in Section 2, have
escaped notice. Mathematical reasons why the curve $y = y_C(t)$ plays an important role in
the formula for $\mu_t$, and also why its inverse function $t = t_0(y)$ appears in $\mu_t$ (see (22) or (23)), are
studied in [13], where it is proved that (22) satisfies a system of non-linear Burgers type partial
differential equations (PDE), which can be interpreted as the motion of a mixed incompressible fluid
driven by evaporation. An initial value problem for the PDE is solved by a standard method of
characteristic curves, one of which is exactly the curve $y = y_C(t)$. The solution to the PDE is then
written using the inverse function of the characteristic curves. In view of this result, Theorem 2
could be viewed as a mathematical result on a hydrodynamic limit.

Our formula also has a direct practical application on the web. We noted in [13, 14] that the
characteristic curve $y = y_C(t)$ is actually observed on the internet as the time-development of web
rankings, which became popular in the late twentieth century as a result of advances in web
technology. In [13, 14] we studied the popularity rankings of topics on 2ch.net, one of the largest
collected posting web pages in Japan, and the book ranking of amazon.co.jp, the Japanese
counterpart of amazon.com, which is a large online bookstore quoted as one of the pioneering
'long-tail' businesses in the era of internet retail [1]. We performed a statistical fit of our model
to the actual data, and showed that the stochastic ranking process with the (generalized) Pareto
distribution as $\lambda$ applies to these social and economic activities. The statistical fits have
shown [13, 14] that these activities are more 'smash-hit' based than
long-tail, in contrast to the idea in [1]. Values of the Pareto parameter in $0 < b < 1$ have also been
found in a study of document access in the MSNBC commercial news web sites [22] by directly
counting the number of accesses.

Returning to the studies of MTF rules, among the earliest works are [26, 15, 19], where the
formula for the stationary distribution of the MTF Markov chain is given. Other early studies
deal with the search cost, which is the position of the requested item before it is moved.
(Figuratively, we can imagine a heap of reference papers. Every time we need a paper we start
our search from the top of the heap, and after use we return it to the top.) The formula for the
average search cost under the stationary distribution was first derived in [20]. Comparison of the search cost
probability with that for the optimal ordering in the $N \to \infty$ limit is considered in [16]. The average search cost
for the stationary distribution has been studied in [20], [5], and the comparison to that for the optimal
ordering is found in [5, 16, 23, 6]. A formula for the generating function of the search cost is obtained
in [9]. Search costs for non-stationary cases have also been studied [21 [231 E M-. There are also
studies of the conditional expectations of search costs [10], of the cache miss (fault) probability in
least-recently-used (LRU) caching [3 [161 [T71 131 [IS], and of the cases of the generalized Zipf law or Pareto
distribution as the jump-rate distribution [H [HI [171 |25l [4]. For a summary of various studies of MTF
models, see, for example, [5, 18, 25].

We will show in this paper that we can apply the mathematical results in [12, 13] to derive
formulas for the asymptotic distribution of the search cost $C_N$, for a general jump-rate distribution $\lambda$. A
basic formula in the case of the stationary distribution is (33):

\[ \lim_{N\to\infty} \mathrm{P}_\infty\Bigl[\, \frac{1}{N} C_N > x \,\Bigr] = \frac{\int_0^\infty e^{-w t_0(x)}\, w\, \lambda(dw)}{\int_0^\infty w\, \lambda(dw)}. \]

Using the formula above, we can obtain the asymptotics of the search cost probabilities for general
$\lambda$. We have formulas for non-stationary cases as well as for the case of the stationary distribution (see
Section 3.4).

The plan of the present paper is as follows. In Section 2 we summarize the main mathematical
results in [12, 13]. In Section 3 we use these results to derive the formula for the asymptotic
distribution of the search cost for general jump-rate distributions, both for stationary and non-stationary
cases. In Section 4 we reproduce and extend the formulas on asymptotics of the search cost
probabilities in the literature, using the results in Section 3, to show that our formula gives a unified
way of deriving the results for the search costs in the MTF model.

2 Stochastic ranking process.

Let $N$ be the total number of particles aligned in a queue (records of information in a serial file, in
terms of [20], or books on a single shelf, in terms of [15, 18]), and for $i = 1, 2, \dots, N$ and $t \ge 0$, let
$X_i^{(N)}(t)$ be the position (ranking, in terms of [12, 13, 14]) of particle $i$ in the queue at time $t$.

The particles jump at random jump times to the top position of the queue. Denote by $\tau_{i,j}^{(N)}$ the
time that particle $i$ jumps for the $j$-th time to the top position. Namely, for each $i$, $X_i^{(N)}(\tau_{i,j}^{(N)}) = 1$,
$j = 1, 2, \dots$. ($\tau_{i,j}^{(N)}$ is the time of the $j$-th request of record $i$, in terms of [20], or the time that a book is
requested and returned at the left end of the shelf 'nearest to the librarian's desk', in terms of [15, 18].)
Besides the jump to the top position, $X_i^{(N)}(t)$ changes its value when some other particle nearer to
the tail position jumps to the top and the particle $i$ is pushed towards the tail to make room for
the jumped particle. Namely, for each $i' \ne i$ and $j' = 1, 2, \dots$, if $X_i^{(N)}(\tau_{i',j'}^{(N)} - 0) < X_{i'}^{(N)}(\tau_{i',j'}^{(N)} - 0)$
then $X_i^{(N)}(\tau_{i',j'}^{(N)}) = X_i^{(N)}(\tau_{i',j'}^{(N)} - 0) + 1$. Otherwise, $X_i^{(N)}(t)$ is constant in $t$.

We assume that the jump times $\tau_{i,j}^{(N)}$ are independent in $i$, and are independent of $X_i^{(N)}(t)$,
$i = 1, 2, \dots, N$, $t \ge 0$. For simplicity of notation, we put $\tau_{i,0}^{(N)} = 0$, $i = 1, 2, \dots, N$, and further assume
that for each $i = 1, 2, \dots, N$, $\{\tau_{i,j+1}^{(N)} - \tau_{i,j}^{(N)} \mid j = 0, 1, 2, \dots\}$ are independent, with a distribution that is
identical for all $j$ and is the exponential distribution

\[ \mathrm{P}[\, \tau_{i,1}^{(N)} \le t \,] = 1 - e^{-w_i^{(N)} t}, \quad t \ge 0, \tag{1} \]

for a positive constant (the jump rate of the particle $i$) $w_i^{(N)} > 0$.

Alternatively, we may define $X^{(N)} = (X_1^{(N)}, \dots, X_N^{(N)})$ as a Markov process on the state space
$S_N$ of the set of $N!$ permutations of $\{1, 2, \dots, N\}$, with the Poisson jump times $\{\tau_{i,j}^{(N)} \mid i =
1, 2, \dots, N,\ j = 1, 2, 3, \dots\}$ determined by (1).

Note that with probability 1, $\tau_{i,j}^{(N)}$, $j = 0, 1, 2, \dots$, in (1) is strictly increasing, and that $\tau_{i,j}^{(N)} \ne
\tau_{i',j'}^{(N)}$ for any different pair of suffices $(i,j) \ne (i',j')$ unless $j = j' = 0$. We may (and will) therefore
work on the event that these properties of the $\tau_{i,j}^{(N)}$'s hold. In particular, if we align the distinct random
times $\tau_{i,j}^{(N)}$ in an increasing order and denote the $k$-th number by $\sigma^{(N)}(k)$, namely,

\[ \{\sigma^{(N)}(k) \mid k = 0, 1, 2, 3, \dots\} = \{0\} \cup \{\tau_{i,j}^{(N)} \mid j = 1, 2, \dots,\ i = 1, 2, \dots, N\}; \quad
0 = \sigma^{(N)}(0) < \sigma^{(N)}(1) < \sigma^{(N)}(2) < \cdots, \tag{2} \]

then the stochastic chain $Z^{(N)}(k) = (X_1^{(N)}(\sigma^{(N)}(k)), \dots, X_N^{(N)}(\sigma^{(N)}(k)))$, $k = 0, 1, 2, \dots$, is a Markov
chain on the state space of the permutations of $(1, 2, \dots, N)$, satisfying the move-to-front rules of
[26, 20], with the request probability $p_i^{(N)}$ of the record (or book) $i$ given by

\[ p_i^{(N)} = \frac{w_i^{(N)}}{\sum_{j=1}^N w_j^{(N)}}. \tag{3} \]

Note also that $\sigma^{(N)}(k+1) - \sigma^{(N)}(k)$, $k = 1, 2, \dots$, are independent, identically distributed
exponential random variables, with a common distribution

\[ \mathrm{P}[\, \sigma^{(N)}(1) \le t \,] = 1 - e^{-(w_1^{(N)} + \cdots + w_N^{(N)}) t}, \quad t \ge 0. \tag{4} \]

Let, as in [12], $x_C^{(N)}(t) = \sharp\{i \in \{1, 2, \dots, N\} \mid \tau_{i,1}^{(N)} \le t\}$ denote the boundary position in the
queue such that $\tau_{i,1}^{(N)} \le t$ if $X_i^{(N)}(t) \le x_C^{(N)}(t)$ and $\tau_{i,1}^{(N)} > t$ if $X_i^{(N)}(t) > x_C^{(N)}(t)$. Namely, the
particles towards the top side of $x_C^{(N)}(t)$ have experienced a jump by time $t$, while none of the
particles on the tail side of $x_C^{(N)}(t)$ has jumped up to time $t$.
Denote the empirical distribution of jump rates by

\[ \lambda^{(N)} = \frac{1}{N} \sum_{i=1}^N \delta_{w_i^{(N)}}, \tag{5} \]

where, here and in the following, $\delta_c$ denotes a unit distribution concentrated at $c$. Namely, for any
set $A$,

\[ \int_A \delta_c(dw) = \begin{cases} 1, & \text{if } c \in A, \\ 0, & \text{if } c \notin A. \end{cases} \]


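Proposition 1 ([12, Proposition 2]) Assume

\[ \lambda^{(N)} \to \lambda, \quad N \to \infty, \tag{6} \]

for a probability distribution $\lambda$ on $[0, \infty)$. Then for $t \ge 0$,

\[ y_C^{(N)}(t) := \frac{1}{N} x_C^{(N)}(t) = \frac{1}{N} \sharp\{i \in \{1, 2, \dots, N\} \mid \tau_{i,1}^{(N)} \le t\} \tag{7} \]

converges in probability as $N \to \infty$ to

\[ y_C(t) = 1 - \int_0^\infty e^{-wt}\, \lambda(dw). \tag{8} \]

◇
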

This result says that the trajectory of a particle starting at the top position is approximately given,
for large $N$, by a deterministic trajectory (adjusting the origin of the time parameter $t = 0$ to be
the time that the particle is at the top position)

\[ N y_C(t) = N \Bigl(1 - \int_0^\infty e^{-wt}\, \lambda(dw)\Bigr) \approx N \Bigl(1 - \int_0^\infty e^{-wt}\, \lambda^{(N)}(dw)\Bigr) = \sum_{i=1}^N \bigl(1 - e^{-w_i^{(N)} t}\bigr), \tag{9} \]

as long as it remains in the queue (i.e., conditioned that it does not jump). This is easy to recognize
by noting that the motion of a particle in the queue is caused by the random jumps of other particles,
and that the law of large numbers replaces random jump times by their expectations.
We hereafter assume (6), together with

\[ \lambda(\{0\}) = 0, \tag{10} \]

and

\[ \int_0^\infty w\, \lambda(dw) < \infty. \tag{11} \]


As noted in [12, Proposition 3], $y_C : [0, \infty) \to [0, 1)$ then is continuous, strictly increasing, and
bijective, hence the inverse function $t_0 : [0, 1) \to [0, \infty)$ exists, satisfying

\[ y_C(t_0(y)) = y, \quad 0 \le y < 1, \tag{12} \]

and

\[ y = 1 - \int_0^\infty e^{-w t_0(y)}\, \lambda(dw). \tag{13} \]
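As a numerical illustration of (8), (12) and (13), the following minimal sketch evaluates $y_C(t)$ and its inverse $t_0(y)$ for a discrete jump-rate distribution. The names `y_c`, `t_0`, `rates` and `probs` are chosen here for illustration only; the sketch assumes $\lambda = \sum_a \texttt{probs}[a]\, \delta_{\texttt{rates}[a]}$ with strictly positive rates, and it finds $t_0$ by plain bisection, which works because $y_C$ is continuous and strictly increasing.

```python
import math


def y_c(t, rates, probs):
    """y_C(t) = 1 - sum_a probs[a] * exp(-rates[a] * t), i.e. (8) for a discrete lambda."""
    return 1.0 - sum(p * math.exp(-w * t) for w, p in zip(rates, probs))


def t_0(y, rates, probs, tol=1e-12):
    """Inverse function of y_C on [0, 1), computed by bisection (assumes positive rates)."""
    lo, hi = 0.0, 1.0
    while y_c(hi, rates, probs) < y:      # enlarge the bracket until y_C(hi) >= y
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if y_c(mid, rates, probs) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Single jump rate w = 1: y_C(t) = 1 - exp(-t), so t_0(y) = -log(1 - y).
t_half = t_0(0.5, [1.0], [1.0])
```

For the single-rate case the inverse is known in closed form, $t_0(y) = -\log(1-y)/w$, which gives a simple check of the bisection.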

Differentiating (8) and (12), we have

\[ \frac{dy_C}{dt}(t) = \int_0^\infty w e^{-wt}\, \lambda(dw) = \frac{1}{\dfrac{dt_0}{dy}(y_C(t))}. \tag{14} \]

Now, consider an $N \to \infty$ scaling limit of the empirical distribution on the product space of
jump rate and position;



where, 



(15) 



(16) 



We assume that the initial configuration of the queue $(X_1^{(N)}(0), \dots, X_N^{(N)}(0)) = (x_{1,0}^{(N)}, \dots, x_{N,0}^{(N)})$ is
such that the initial empirical distribution $\mu_0^{(N)}$ converges weakly as $N \to \infty$ to a probability
distribution $\mu_0$ whose second marginal is the Lebesgue measure on $[0, 1)$; for almost all $y \in [0, 1)$, there
exists a probability measure $\mu_{y,0}$ on the space of jump rates such that $\mu_0(dw, dy) = \mu_{y,0}(dw)\, dy$.
To state our main result in [12], we generalize (8) and define

\[ y_C(y, t) = 1 - \int_y^1 \int_0^\infty e^{-wt}\, \mu_{z,0}(dw)\, dz, \quad t \ge 0,\ 0 \le y < 1. \tag{17} \]




In particular, $y_C(t) = y_C(0, t)$. For each $t \ge 0$, $y_C(\cdot, t) : [0, 1) \to [y_C(t), 1)$ is a continuous, strictly
increasing, bijective function of $y$, hence the inverse function $\hat y(\cdot, t) : [y_C(t), 1) \to [0, 1)$ exists:

\[ 1 - y = \int_{\hat y(y,t)}^1 \int_0^\infty e^{-wt}\, \mu_{z,0}(dw)\, dz, \quad t \ge 0,\ y_C(t) \le y < 1. \tag{18} \]

In analogy to (9), the particle initially at the position $Ny$ will be approximately at $N y_C(y, t)$
at time $t$ for large $N$, provided the particle does not jump to the top position by the time $t$. It
holds that

\[ \frac{\partial \hat y}{\partial y}(y, t) = \frac{1}{\int_0^\infty e^{-wt}\, \mu_{\hat y(y,t),0}(dw)}. \tag{19} \]




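Theorem 2 ([12, Theorem 5]) Assume (10) and (11), and the convergence of the initial
distribution $\mu_0^{(N)}$ as $N \to \infty$. Then the joint empirical distribution $\mu_t^{(N)}(dw, dy)$ of jump rate and
position at time $t$ converges as $N \to \infty$ to a distribution $\mu_t(dw, dy) = \mu_{y,t}(dw)\, dy$ on $\mathbb{R}_+ \times [0, 1)$,
that is, for any bounded continuous function $f : \mathbb{R}_+ \times [0, 1) \to \mathbb{R}$,

\[ \lim_{N \to \infty} \int\!\!\int f(w, y)\, \mu_t^{(N)}(dw, dy) = \int_0^1 \Bigl( \int_0^\infty f(w, y)\, \mu_{y,t}(dw) \Bigr) dy, \quad \text{in probability.} \tag{20} \]

The measure $\mu_{y,t}(dw)$ is given by

\[ \mu_{y,t}(dw) = \begin{cases}
\dfrac{w e^{-w t_0(y)}\, \lambda(dw)}{\int_0^\infty \tilde w e^{-\tilde w t_0(y)}\, \lambda(d\tilde w)}, & y < y_C(t), \\[2ex]
\dfrac{e^{-wt}\, \mu_{\hat y(y,t),0}(dw)}{\int_0^\infty e^{-\tilde w t}\, \mu_{\hat y(y,t),0}(d\tilde w)}, & y > y_C(t).
\end{cases} \tag{21} \]

◇

As noted in [12, §2.1 Remark], the assumption (11) assures that $\mu_{0,t}$ is well-defined. The main
results in Theorem 2 for $y > 0$ hold without (11).
This completes a summary of the main results in [12].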

It is notationally simpler to write (21) in a form integrated by $y$. Recalling (14) and (19), we
have

\[ \mu_t(dw, [y, 1)) = \begin{cases}
e^{-w t_0(y)}\, \lambda(dw), & y < y_C(t), \\
\mu_0(dw, [\hat y(y,t), 1))\, e^{-wt}, & y > y_C(t).
\end{cases} \tag{22} \]

Essential points about the formula are the importance of the curve $y = y_C(t)$, and the appearance
of its inverse function $t_0$ as well as the inverse function $\hat y$ of $y_C(y, t)$. An important observation
in [13] concerning these points is that (22) satisfies a system of non-linear Burgers type partial
differential equations (see (25) in Theorem 3 below). An initial value problem for (25) is solved in
[13] by a standard method of characteristic curves, which precisely are the curves $y = y_C(t)$ and
$y = y_C(y, t)$. The solution to the PDE is then written using the inverse function of the characteristic
curves.

To be explicit, consider, in particular, the case that the limit distribution of jump rates $\lambda$ is a
discrete distribution: $\lambda = \sum_\alpha \rho_\alpha \delta_{f_\alpha}$, where the summation is taken over finite or countably infinite
numbers, or equivalently, $\lambda(\{f_\alpha\}) = \rho_\alpha$, $\alpha = 1, 2, \dots$, where the $\rho_\alpha$ are positive numbers satisfying
$\sum_\alpha \rho_\alpha = 1$. For $\alpha = 1, 2, \dots$, put

\[ U_\alpha(y, t) := \mu_t(\{f_\alpha\}, [y, 1)) = \int_y^1 \mu_{z,t}(\{f_\alpha\})\, dz, \tag{23} \]

and $U_\alpha(y) = \int_y^1 \mu_{z,0}(\{f_\alpha\})\, dz$ for the initial data. Then (22) is written as

\[ U_\alpha(y, t) = \begin{cases}
\rho_\alpha e^{-f_\alpha t_0(y)}, & y < y_C(t), \\
U_\alpha(\hat y(y,t))\, e^{-f_\alpha t}, & y > y_C(t).
\end{cases} \tag{24} \]



Theorem 3 ([13, §2]) Under the assumptions in Theorem 2, (24) is the unique (classical)
solution to an initial value problem of a system of non-linear partial differential equations defined
by

\[ \frac{\partial U_\alpha}{\partial t}(y, t) + \sum_\beta f_\beta U_\beta(y, t)\, \frac{\partial U_\alpha}{\partial y}(y, t) = -f_\alpha U_\alpha(y, t), \quad (y, t) \in [0, 1) \times [0, \infty),\ \alpha = 1, 2, \dots, \tag{25} \]

with the boundary condition $U_\alpha(0, t) = \rho_\alpha$, $\alpha = 1, 2, \dots$, $t \ge 0$, and the initial data $U_\alpha(y, 0) = U_\alpha(y)$,
$\alpha = 1, 2, \dots$. ◇

This completes a summary of the mathematical part of the main results in [13].

3 Asymptotic distribution of search cost probabilities.

In this section, we will relate our results summarized in Section 2 to the previous studies in
move-to-front rules.

3.1 Search cost.

A typical quantity of interest in the studies of move-to-front rules is the search cost $C_N$, which
denotes the position of a particle just before its jump to the top.
Let $Q_1^{(N)}$ be the random variable defined by

\[ \sigma^{(N)}(1) = \tau^{(N)}_{Q_1^{(N)},1}, \tag{26} \]

where $\sigma^{(N)}$ is defined in (2). Then $Q_1^{(N)}$ matches the definition of $Q_1$ in [20], and by definition,

\[ \mathrm{P}[\, Q_1^{(N)} = i \,] = p_i^{(N)}, \quad i = 1, \dots, N, \tag{27} \]

where $p_i^{(N)}$ is as in (3). $C_N$ (denoted by $X$ in [20]) is then given by $C_N = X^{(N)}_{Q_1^{(N)}}(\sigma^{(N)}(1) - 0)$. Note
that this is equal to $X^{(N)}_{Q_1^{(N)}}(0)$, because particles do not move before the first jump. We see from
Theorem 2 that, under the assumptions of Section 2, $C_N$ asymptotically scales as $N$ in the limit
$N \to \infty$, and therefore the asymptotic properties of



where $Y_i^{(N)}$ is defined in (16), are of interest.
3.2 Distribution of search cost: Stationary case. 

As noted in Section 2, the stochastic ranking process can be viewed as a continuous-time Markov
chain on $S_N$. Namely, $X^{(N)}(t)$ can be identified with an element $\pi = (\pi_1, \dots, \pi_N)$ of $S_N$ so
that $\pi_i = X_i^{(N)}(t)$, $i = 1, \dots, N$. The stochastic ranking process, viewed as a continuous-time
Markov chain on $S_N$, has a stationary distribution. (The stationary distribution is essentially
the same as the stationary distribution of the move-to-front rules obtained by [26, 15], in a different
way of correspondence, $\pi_i$ being the label of the particle at the $i$-th position in the references.)
Denote by $\mathrm{E}_\infty$ ($\mathrm{P}_\infty$, respectively) the expectation (resp., probability) with respect to the
stationary distribution for the initial configurations. If the distribution of the initial configuration
$(x_{1,0}^{(N)}, \dots, x_{N,0}^{(N)}) = (X_1^{(N)}(0), \dots, X_N^{(N)}(0))$ is the stationary distribution, then it is the distribution
of $(X_1^{(N)}(t), \dots, X_N^{(N)}(t))$ for all $t \ge 0$. In particular, for the $\mu_t^{(N)}$ in (15),

\[ \mu_\infty^{(N)} := \mathrm{E}_\infty[\, \mu_t^{(N)} \,] = \mathrm{E}_\infty[\, \mu_0^{(N)} \,], \quad t \ge 0. \tag{29} \]

Let $f(w, y)$ be a bounded continuous function with compact support. Let $0 < y_0 < 1$ be such
that $f(w, y) = 0$ for $y \ge y_0$, and let $t > t_0(y_0)$, where $t_0$ is as in (12). Note that $\mu_{y,t}$ in (21) for
$t > t_0(y)$ is constant in $t$ and independent of the initial distribution. Theorem 2, together with
Fubini's theorem and the dominated convergence theorem, therefore implies

\[ \lim_{N \to \infty} \int\!\!\int f(w, y)\, \mu_\infty^{(N)}(dw, dy) = \lim_{N \to \infty} \mathrm{E}_\infty\Bigl[ \int\!\!\int f(w, y)\, \mu_t^{(N)}(dw, dy) \Bigr]
= \mathrm{E}_\infty\Bigl[ \int\!\!\int f(w, y)\, \mu_t(dw, dy) \Bigr]
= \int\!\!\int_{(w,y) \in [0,\infty) \times [0,1)} \frac{f(w, y)\, w e^{-w t_0(y)}}{\int_0^\infty \tilde w e^{-\tilde w t_0(y)}\, \lambda(d\tilde w)}\, dy\, \lambda(dw). \tag{30} \]

This implies that the joint empirical distribution $\mu_\infty^{(N)}$ of the jump rate and the position under the
stationary distribution in (29) converges as $N \to \infty$ to

\[ \lim_{N \to \infty} \mu_\infty^{(N)}(dw, dy) = \mu_\infty(dw, dy) := \frac{w e^{-w t_0(y)}\, \lambda(dw)\, dy}{\int_0^\infty \tilde w e^{-\tilde w t_0(y)}\, \lambda(d\tilde w)}. \tag{31} \]


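The distribution function of $\frac{1}{N} C_N$ in (28) in the stationary state is then given by

\[ \mathrm{P}_\infty\Bigl[\, \frac{1}{N} C_N > x \,\Bigr] = \sum_{i=1}^N \mathrm{P}_\infty[\, Y_i^{(N)}(0) > x,\ Q_1^{(N)} = i \,]
= \sum_{i=1}^N \mathrm{P}_\infty[\, Y_i^{(N)}(0) > x \,]\, \mathrm{P}[\, Q_1^{(N)} = i \,] \]
\[ = \sum_{i=1}^N \frac{w_i^{(N)}}{\sum_{j=1}^N w_j^{(N)}}\, \mathrm{P}_\infty[\, Y_i^{(N)}(0) > x \,]
= \frac{\int\!\!\int_{(w,y) \in [0,\infty) \times (x,1)} w\, \mu_\infty^{(N)}(dw, dy)}{\int_0^\infty w\, \lambda^{(N)}(dw)}, \tag{32} \]

where we first classified the total event by the first particle to jump, then used the independence
of $Q_1^{(N)}$ and $\{Y_i^{(N)}(0)\}$, and finally, (3) and (29). Combining (31) with (32), and changing the
integration variable $y$ to $t = t_0(y)$, using (14), we have

\[ \lim_{N \to \infty} \mathrm{P}_\infty\Bigl[\, \frac{1}{N} C_N > x \,\Bigr]
= \frac{\int\!\!\int_{(w,y) \in [0,\infty) \times (x,1)} w\, \mu_\infty(dw, dy)}{\int_0^\infty w\, \lambda(dw)}
= \frac{\int\!\!\int_{(w,t) \in [0,\infty) \times (t_0(x),\infty)} e^{-wt}\, dt\, w^2\, \lambda(dw)}{\int_0^\infty w\, \lambda(dw)}
= \frac{\int_0^\infty e^{-w t_0(x)}\, w\, \lambda(dw)}{\int_0^\infty w\, \lambda(dw)}. \tag{33} \]
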

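Similarly, we have, for a measurable function $f$,

\[ \lim_{N \to \infty} \mathrm{E}_\infty\Bigl[\, f\Bigl(\frac{1}{N} C_N\Bigr) \Bigr]
= \frac{\int\!\!\int_{(w,y) \in [0,\infty) \times [0,1)} w f(y)\, \mu_\infty(dw, dy)}{\int_0^\infty w\, \lambda(dw)}
= \frac{\int\!\!\int_{(w,t) \in [0,\infty)^2} f(y_C(t))\, e^{-wt}\, dt\, w^2\, \lambda(dw)}{\int_0^\infty w\, \lambda(dw)}. \tag{34} \]

Note that if (11) fails, then the denominator in the right hand side of (33) and (34) diverges.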
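The last expression in (33) is easy to evaluate numerically once $t_0(x)$ is known. The sketch below does this for a discrete jump-rate distribution; the function name `stationary_tail` and the arguments `rates`, `probs` are illustrative assumptions (with $\lambda = \sum_a \texttt{probs}[a]\, \delta_{\texttt{rates}[a]}$ and positive rates), and $t_0(x)$ is obtained by bisection on (13).

```python
import math


def stationary_tail(x, rates, probs, tol=1e-12):
    """Limit of P_inf[C_N / N > x] in (33) for a discrete lambda with positive rates."""
    def y_c(t):
        return 1.0 - sum(p * math.exp(-w * t) for w, p in zip(rates, probs))

    # Solve y_C(t0) = x by bisection (y_C is continuous and strictly increasing).
    lo, hi = 0.0, 1.0
    while y_c(hi) < x:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if y_c(mid) < x:
            lo = mid
        else:
            hi = mid
    t0 = 0.5 * (lo + hi)

    numerator = sum(p * w * math.exp(-w * t0) for w, p in zip(rates, probs))
    denominator = sum(p * w for w, p in zip(rates, probs))
    return numerator / denominator


# If all items share one jump rate, the requested item is uniformly placed,
# so the tail should be 1 - x.
uniform_case = stationary_tail(0.3, [2.0], [1.0])
```

When all jump rates coincide, $e^{-w t_0(x)} = 1 - x$ by (13), so (33) reduces to $1 - x$; this is the expected answer for a uniformly distributed position and serves as a sanity check.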
3.3 Search cost: Comparison with optimally ordered case. 

Comparison between the search cost $C_N$ for the move-to-front rules and the search cost $R_N$ when
the particles are in the optimal static ordering, i.e., when the particles are arranged in decreasing
order of request probabilities $p_i$, has been extensively studied [5, 6, 16].
For $0 \le x \le 1$, define $w^{(N)}(x)$ by

\[ \lambda^{(N)}([0, w^{(N)}(x)]) = \frac{1}{N} [N(1 - x)], \tag{35} \]

where $[N(1 - x)]$ denotes the largest integer not exceeding $N(1 - x)$. Noting (27), we have

\[ \mathrm{P}\Bigl[\, \frac{1}{N} R_N > x \,\Bigr] = \frac{\int_0^{w^{(N)}(x)} w\, \lambda^{(N)}(dw)}{\int_0^\infty w\, \lambda^{(N)}(dw)}. \tag{36} \]

Taking the ratio to (32), and proceeding as in the derivation of (33), we have

\[ \lim_{N \to \infty} \frac{\mathrm{P}_\infty[\, \frac{1}{N} C_N > x \,]}{\mathrm{P}[\, \frac{1}{N} R_N > x \,]} = \frac{\int_0^\infty e^{-w t_0(x)}\, w\, \lambda(dw)}{\int_0^{w(x)} w\, \lambda(dw)}, \quad 0 < x < 1, \tag{37} \]

where

\[ \lambda([0, w(x)]) = 1 - x. \tag{38} \]

Note that all the $N \to \infty$ limit results so far, except for (37), assume the condition (11), whereas
(37) holds even if (11) fails, i.e., $\int_0^\infty w\, \lambda(dw) = \infty$. (See the remark after Theorem 2.) Furthermore, if
(11) holds, then (37), with (13), (38) and the dominated convergence theorem, implies

\[ \lim_{x \to +0} \lim_{N \to \infty} \frac{\mathrm{P}_\infty[\, \frac{1}{N} C_N > x \,]}{\mathrm{P}[\, \frac{1}{N} R_N > x \,]} = \frac{\int_0^\infty w\, \lambda(dw)}{\int_0^\infty w\, \lambda(dw)} = 1, \tag{39} \]

which, considering a trivial equality $\mathrm{P}_\infty[\, \frac{1}{N} C_N \ge 0 \,] = \mathrm{P}[\, \frac{1}{N} R_N \ge 0 \,] = 1$, is a natural result. In
contrast, (39) may fail if $\int_0^\infty w\, \lambda(dw) = \infty$. (See Section 4.3.)


3.4 Distribution of search cost: Non-stationary case. 

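We can generalize (33) in Section 3.2 to the non-stationary cases. Let us return to the setting
in Section 2 and assume that the initial value of the process is given: $(X_1^{(N)}(0), \dots, X_N^{(N)}(0)) =
(x_{1,0}^{(N)}, \dots, x_{N,0}^{(N)})$. Let $\tau^{(N)}(t) = \inf\{\sigma^{(N)}(k) \mid \sigma^{(N)}(k) > t\}$ and define $I^{(N)}(t)$ by $\tau^{(N)}(t) = \tau^{(N)}_{I^{(N)}(t),j}$
for some $j$. Define the search cost at time $t$ by $C_N(t) = X^{(N)}_{I^{(N)}(t)}(t)$. We have,

\[ \mathrm{P}_t\Bigl[\, \frac{1}{N} C_N(t) > x \,\Bigr] = \sum_{i=1}^N \mathrm{P}_t[\, Y_i^{(N)}(t) > x,\ I^{(N)}(t) = i \,]
= \sum_{i=1}^N \mathrm{P}_t[\, Y_i^{(N)}(t) > x \,]\, \mathrm{P}_t[\, I^{(N)}(t) = i \,] \]
\[ = \sum_{i=1}^N \frac{w_i^{(N)}}{\sum_{j=1}^N w_j^{(N)}}\, \mathrm{P}_t[\, Y_i^{(N)}(t) > x \,]
= \frac{\int\!\!\int_{(w,y) \in [0,\infty) \times (x,1)} w\, \mu_t^{(N)}(dw, dy)}{\int_0^\infty w\, \lambda^{(N)}(dw)}. \tag{40} \]

Letting $N \to \infty$, we have

\[ \lim_{N \to \infty} \mathrm{P}_t\Bigl[\, \frac{1}{N} C_N(t) > x \,\Bigr] = \frac{\int_x^1 \int_0^\infty w\, \mu_{y,t}(dw)\, dy}{\int_0^\infty w\, \lambda(dw)}, \tag{41} \]

where $\mu_{y,t}$ is given by (21).

We also remark that since (21) coincides, for $y < y_C(t)$, with the stationary distribution (31),
the speed of approach to the stationary state is evaluated by (8):

\[ 1 - y_C(t) = \int_0^\infty e^{-wt}\, \lambda(dw). \tag{42} \]
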

4 Formulas related to search cost probabilities in the move-to-front rules.

Some formulas related to the search cost for the move-to-front rules have simple forms, and were
naturally found in the early studies. In this section we will derive formulas corresponding to some of
these formulas, in the formulation of Section 2.

4.1 Average search cost. 

4.1.1 Asymptotic formula for the average search cost. 

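In [20], the average search cost under the stationary distribution $\mathrm{E}_\infty[\, C_N \,]$ (denoted by $\mu$ in [5]) is
derived. Using the results and notations in Section 3 and Section 2, we can calculate the asymptotics
of this quantity. With (34) we have

\[ \lim_{N \to \infty} \mathrm{E}_\infty\Bigl[\, \frac{1}{N} C_N \,\Bigr] = \frac{1}{\int_0^\infty w\, \lambda(dw)} \int_0^\infty \!\!\int_0^\infty y_C(t)\, w^2 e^{-wt}\, \lambda(dw)\, dt. \]

Using (8) and performing the integration with respect to $t$, we obtain

\[ \lim_{N \to \infty} \mathrm{E}_\infty\Bigl[\, \frac{1}{N} C_N \,\Bigr]
= \frac{1}{\int_0^\infty w\, \lambda(dw)} \Bigl( \int_0^\infty w\, \lambda(dw) - \int_0^\infty \!\!\int_0^\infty \frac{w^2}{w + \tilde w}\, \lambda(d\tilde w)\, \lambda(dw) \Bigr)
= \frac{1}{\int_0^\infty w\, \lambda(dw)} \int_0^\infty \!\!\int_0^\infty \frac{w \tilde w}{w + \tilde w}\, \lambda(d\tilde w)\, \lambda(dw). \tag{43} \]
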


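Let us check that (43) is consistent with the corresponding result in [20] (with notation changed
to those we adopt here):

\[ \mathrm{E}_\infty[\, C_N \,] = \frac{1}{2} + \sum_{i=1}^N \sum_{j=1}^N \frac{p_i^{(N)} p_j^{(N)}}{p_i^{(N)} + p_j^{(N)}}. \]

With (3) and (5) we have

\[ \mathrm{E}_\infty\Bigl[\, \frac{1}{N} C_N \,\Bigr] = \frac{1}{2N} + \frac{1}{N} \frac{1}{\sum_{k=1}^N w_k^{(N)}} \sum_{i=1}^N \sum_{j=1}^N \frac{w_i^{(N)} w_j^{(N)}}{w_i^{(N)} + w_j^{(N)}}
= \frac{1}{2N} + \frac{1}{\int_0^\infty w\, \lambda^{(N)}(dw)} \int_0^\infty \!\!\int_0^\infty \frac{w \tilde w}{w + \tilde w}\, \lambda^{(N)}(d\tilde w)\, \lambda^{(N)}(dw) \]
\[ \to \frac{1}{\int_0^\infty w\, \lambda(dw)} \int_0^\infty \!\!\int_0^\infty \frac{w \tilde w}{w + \tilde w}\, \lambda(d\tilde w)\, \lambda(dw), \quad N \to \infty, \]

which coincides with (43).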
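The consistency between the finite-$N$ average search cost and its $N \to \infty$ limit can also be checked numerically. The following is a minimal sketch under stated assumptions: the function names `mccabe_average_cost` and `limit_average_cost` are made up here, the finite-$N$ expression is the double-sum formula $\frac12 + \sum_{i,j} p_i p_j / (p_i + p_j)$, and the jump-rate distribution is taken to be the two-point law $\lambda = \frac12 \delta_1 + \frac12 \delta_2$ purely for illustration.

```python
def mccabe_average_cost(p):
    """Stationary mean search cost E[C_N] = 1/2 + sum_{i,j} p_i p_j / (p_i + p_j)."""
    return 0.5 + sum(pi * pj / (pi + pj) for pi in p for pj in p)


def limit_average_cost(rates, probs):
    """Right-hand side of (43) for lambda = sum_a probs[a] * delta_{rates[a]}."""
    mean_rate = sum(q * w for w, q in zip(rates, probs))
    double_sum = sum(qa * qb * wa * wb / (wa + wb)
                     for wa, qa in zip(rates, probs)
                     for wb, qb in zip(rates, probs))
    return double_sum / mean_rate


N = 200
w = [1.0] * (N // 2) + [2.0] * (N // 2)   # half of the items have rate 1, half have rate 2
total = sum(w)
p = [wi / total for wi in w]              # request probabilities p_i = w_i / sum_j w_j

finite = mccabe_average_cost(p) / N       # E[C_N / N] for finite N
limit = limit_average_cost([1.0, 2.0], [0.5, 0.5])
```

In this example the empirical distribution $\lambda^{(N)}$ equals $\lambda$ exactly, so `finite` and `limit` differ only by the $1/(2N)$ term.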



4.1.2 Comparison with search cost for the optimal ordering. 

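One of the first studies on comparison of the search cost $C_N$ with the search cost $R_N$ for the optimal
ordering introduced in Section 3.3 is found in [5], which gives the following universal bound for the
expectations:

\[ \mathrm{E}[\, R_N \,] \le \mathrm{E}_\infty[\, C_N \,] \le 2\, \mathrm{E}[\, R_N \,] - 1. \]

The corresponding relation for $N \to \infty$ then is

\[ \lim_{N \to \infty} \frac{1}{N} \mathrm{E}[\, R_N \,] \le \lim_{N \to \infty} \frac{1}{N} \mathrm{E}_\infty[\, C_N \,] \le 2 \lim_{N \to \infty} \frac{1}{N} \mathrm{E}[\, R_N \,]. \tag{44} \]

To see that this relation follows from the results in Section 3, first note that

\[ \mathrm{E}[\, R_N \,] = \frac{1}{2} \sum_{i=1}^N \sum_{j=1}^N \min\{p_i^{(N)}, p_j^{(N)}\} + \frac{1}{2}. \]

With (3) and (5) we then have

\[ \mathrm{E}\Bigl[\, \frac{1}{N} R_N \,\Bigr] = \frac{1}{2 \int_0^\infty w\, \lambda^{(N)}(dw)} \int_0^\infty \!\!\int_0^\infty \min\{w, \tilde w\}\, \lambda^{(N)}(d\tilde w)\, \lambda^{(N)}(dw) + \frac{1}{2N}, \]

hence

\[ \lim_{N \to \infty} \mathrm{E}\Bigl[\, \frac{1}{N} R_N \,\Bigr] = \frac{1}{2 \int_0^\infty w\, \lambda(dw)} \int_0^\infty \!\!\int_0^\infty \min\{w, \tilde w\}\, \lambda(d\tilde w)\, \lambda(dw). \tag{45} \]

(44) is now a simple consequence of (45) and (43), if one notes a simple inequality

\[ \frac{1}{2} \min\{x, y\} \le \frac{xy}{x + y} \le \min\{x, y\}, \quad x \ge 0,\ y \ge 0. \]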
2 a; + y 

We also note that there is a result [6] which proves that Hilbert's inequality implies a stronger
universal upper bound, which implies for the present case,

\[ \lim_{N \to \infty} \frac{1}{N} \mathrm{E}_\infty[\, C_N \,] \le \frac{\pi}{2} \lim_{N \to \infty} \frac{1}{N} \mathrm{E}[\, R_N \,]. \tag{46} \]




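In fact, as derived in [6], we have

\[ \frac{1}{2} \int_0^\infty \!\!\int_0^\infty \min\{w, \tilde w\}\, \lambda(d\tilde w)\, \lambda(dw)
= \int_0^\infty \!\!\int_{\tilde w \ge w} w\, \lambda(d\tilde w)\, \lambda(dw)
= \int_0^\infty w\, \lambda([w, \infty))\, \lambda(dw) \]
\[ = \Bigl[ -\frac{1}{2} w\, \lambda([w, \infty))^2 \Bigr]_{w=0}^{w=\infty} + \frac{1}{2} \int_0^\infty \lambda([w, \infty))^2\, dw
= \frac{1}{2} \int_0^\infty \lambda([w, \infty))^2\, dw, \]

and

\[ \int_0^\infty \!\!\int_0^\infty \frac{w \tilde w}{w + \tilde w}\, \lambda(d\tilde w)\, \lambda(dw)
= \int_0^\infty \Bigl( \Bigl[ -\frac{w \tilde w}{w + \tilde w}\, \lambda([\tilde w, \infty)) \Bigr]_{\tilde w = 0}^{\tilde w = \infty} + \int_0^\infty \Bigl( \frac{w}{w + \tilde w} \Bigr)^2 \lambda([\tilde w, \infty))\, d\tilde w \Bigr) \lambda(dw) \]
\[ = \int_0^\infty \!\!\int_0^\infty \Bigl( \frac{w}{w + \tilde w} \Bigr)^2 \lambda(dw)\, \lambda([\tilde w, \infty))\, d\tilde w
= \int_0^\infty \Bigl( \Bigl[ -\Bigl( \frac{w}{w + \tilde w} \Bigr)^2 \lambda([w, \infty)) \Bigr]_{w = 0}^{w = \infty} + \int_0^\infty \frac{2 w \tilde w}{(w + \tilde w)^3}\, \lambda([w, \infty))\, dw \Bigr) \lambda([\tilde w, \infty))\, d\tilde w \]
\[ = \int_0^\infty \!\!\int_0^\infty \frac{2 w \tilde w}{(w + \tilde w)^3}\, \lambda([w, \infty))\, \lambda([\tilde w, \infty))\, dw\, d\tilde w, \]

which, with Hilbert's inequality in the form [11, §9.3] for $K(x, y) = \dfrac{4xy}{(x + y)^3}$, $p = q = 2$, and
$g = f \ge 0$;

\[ \int_0^\infty \!\!\int_0^\infty \frac{4xy}{(x + y)^3}\, f(x) f(y)\, dx\, dy \le k \int_0^\infty f(x)^2\, dx; \quad
k = \int_0^\infty K(x, 1)\, x^{-1/2}\, dx = \int_0^\infty \frac{4 x^{1/2}}{(x + 1)^3}\, dx = \frac{\pi}{2}, \]

imply (46).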

4.1.3 Conditional expectations of search costs. 

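In [5], the average search cost conditioned on a specific particle $i$ (denoted by $\mu_i$ in the reference)
has been obtained. It is related to $\mathrm{E}_\infty[\, C_N \,]$ by

\[ \mathrm{E}_\infty[\, C_N \,] = \sum_{i=1}^N p_i^{(N)} \mu_i. \tag{47} \]

In terms of the conditional expectation $\mathrm{E}_\infty[\, C_N \mid Q_1^{(N)} \,]$, conditioned on the sigma algebra
generated by $Q_1^{(N)}$ (recall (26)), we have

\[ \mathrm{E}_\infty[\, C_N \mid Q_1^{(N)} \,](\omega) = \mu_i, \quad \text{if } Q_1^{(N)}(\omega) = i. \tag{48} \]

With (27) we reproduce (47).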

In considering such quantities, we naturally come across the distribution of 'jumped particles',
that is, the distribution of $w^{(N)}_{Q_1^{(N)}}$. Note that the time evolution of the system is dependent only
on the jump rates. Therefore the search cost of particle $i$ in the stationary state is dependent on
$i$ only through its jump rate $w_i^{(N)}$; if $w_i^{(N)} = w_j^{(N)}$ then the search costs for $i$ and $j$ have the same
distribution. In particular,

\[ \mathrm{E}_\infty[\, C_N \mid Q_1^{(N)} \,] = \mathrm{E}_\infty[\, C_N \mid W_N \,], \tag{49} \]

where $W_N = w^{(N)}_{Q_1^{(N)}}$.
Proceeding as in the argument for (32), we have, for a bounded measurable function $f$,

\[ \mathrm{E}_\infty[\, f(W_N) \,] = \sum_{i=1}^N f(w_i^{(N)})\, \mathrm{P}_\infty[\, Q_1^{(N)} = i \,]
= \sum_{i=1}^N \frac{w_i^{(N)} f(w_i^{(N)})}{\sum_{j=1}^N w_j^{(N)}}
= \frac{\int\!\!\int_{(w,y) \in [0,\infty) \times [0,1)} f(w)\, w\, \mu_\infty^{(N)}(dw, dy)}{\int_0^\infty w\, \lambda^{(N)}(dw)}. \]

As in (33), Theorem 2 therefore implies, for a bounded continuous function $f$,

\[ \lim_{N \to \infty} \mathrm{E}_\infty[\, f(W_N) \,] = \frac{\int\!\!\int f(w)\, w\, \mu_\infty(dw, dy)}{\int_0^\infty w\, \lambda(dw)} = \frac{\int_0^\infty f(w)\, w\, \lambda(dw)}{\int_0^\infty w\, \lambda(dw)}. \tag{50} \]

In other words, the distribution of the jump rates of the jumped particles in the stationary state converges
weakly to the probability measure $\dfrac{w\, \lambda(dw)}{\int_0^\infty \tilde w\, \lambda(d\tilde w)}$, as $N \to \infty$.

Since $t_0(0) = 0$, this distribution is equal to $\mu_{0,\infty}$ in (31), which is the distribution at the top
end of the queue. An intuitive meaning of this equality is that the jumped particles jump to the
top position (the requested records are placed at the top position), so the distribution at $y = 0$ is
the distribution of the jumped particles.

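As noted in (49), to obtain the average search cost of a specific particle $i$ (denoted by $\mu_i$ in [5]),
it suffices to calculate the average search cost conditioned on the jump rate of the jumped particle,
$f(W_N) = \mathrm{E}_\infty[\, C_N \mid W_N \,]$. A basic property of conditional expectation, with (43) and (50), implies

\[ \frac{1}{\int_0^\infty w\, \lambda(dw)} \int_0^\infty \!\!\int_0^\infty \frac{w \tilde w}{w + \tilde w}\, \lambda(d\tilde w)\, \lambda(dw)
= \lim_{N \to \infty} \mathrm{E}_\infty\Bigl[\, \frac{1}{N} C_N \,\Bigr]
= \lim_{N \to \infty} \mathrm{E}_\infty\Bigl[\, \mathrm{E}_\infty\Bigl[\, \frac{1}{N} C_N \Bigm| W_N \,\Bigr] \Bigr] \]
\[ = \frac{1}{\int_0^\infty w\, \lambda(dw)} \int_0^\infty \lim_{N \to \infty} \mathrm{E}_\infty\Bigl[\, \frac{1}{N} C_N \Bigm| W_N \,\Bigr](w)\, w\, \lambda(dw). \]

Thus we find

\[ \lim_{N \to \infty} \frac{1}{N} \mathrm{E}_\infty[\, C_N \mid W_N \,](w) = \int_0^\infty \frac{\tilde w}{w + \tilde w}\, \lambda(d\tilde w). \tag{51} \]
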



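This result is to be compared with $\mu_i$ in [5, Eq. (10)], which reads, in our notation,

\[ \frac{1}{N} \mathrm{E}_\infty[\, C_N \mid W_N \,](w_i^{(N)}) = \frac{1}{N} \mu_i = \frac{1}{2N} + \frac{1}{N} \sum_{j=1}^N \frac{w_j^{(N)}}{w_i^{(N)} + w_j^{(N)}}. \]

For large $N$, (5) then implies

\[ \frac{1}{N} \mu_i = \frac{1}{2N} + \int_0^\infty \frac{\tilde w}{w_i^{(N)} + \tilde w}\, \lambda^{(N)}(d\tilde w) \approx \int_0^\infty \frac{\tilde w}{w_i^{(N)} + \tilde w}\, \lambda(d\tilde w), \]

which is consistent with (51).

4.2 Cache miss probability.

If $x \le y_C(t)$ we can reduce (41) further and have

\[ \lim_{N \to \infty} \mathrm{P}_t\Bigl[\, \frac{1}{N} C_N(t) > x \,\Bigr] = \frac{\int_0^\infty e^{-w t_0(x)}\, w\, \lambda(dw)}{\int_0^\infty w\, \lambda(dw)}. \tag{52} \]

This is because the limiting distribution $\mu_{y,t}$ for $y < y_C(t)$ is equal to that for the stationary case, $\mu_{y,\infty}$.
(See (21) and (31).) Hence we have, for $x \le y_C(t)$,

\[ \lim_{N \to \infty} \mathrm{P}_t\Bigl[\, \frac{1}{N} C_N > x \,\Bigr] = 1 - \lim_{N \to \infty} \mathrm{P}_t\Bigl[\, \frac{1}{N} C_N \le x \,\Bigr] = 1 - \lim_{N \to \infty} \mathrm{P}_\infty\Bigl[\, \frac{1}{N} C_N \le x \,\Bigr]
= \lim_{N \to \infty} \mathrm{P}_\infty\Bigl[\, \frac{1}{N} C_N > x \,\Bigr], \]

so that (33) implies (52).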

The cache miss (fault) probability in least-recently-used (LRU) caching has been one of
the modern areas of extensive study in the application of the move-to-front rules [3 [HI [T71 HI [25].
If there are $N$ records of information in a computer memory, or web pages on the internet, out
of which $Nx$ records or pages, respectively, can be cached for further quick access, the event
$C_N > Nx$ represents a cache miss or cache fault, by regarding particles as records of information or
web pages to be accessed. The probability (52) is therefore of interest.

In particular, [4] considers a quantity defined, in our notation, by

\[ M^{(N)}(t) = \mathrm{P}_t\Bigl[\, \frac{1}{N} C_N(t) > y_C^{(N)}(t) \,\Bigr]. \tag{53} \]

Recalling the definition (7) of $y_C^{(N)}$, we see that $M^{(N)}(t)$ is the probability that the jump at time
$t$ is the jumped particle's first jump since $t = 0$. $M^{(N)}(t)$ therefore corresponds to the cache miss
(fault) probability in an ideal case that all the once requested records are stored in a cache memory
of ideally large size.

Since the limiting distribution (41) of $\frac{1}{N} C_N$ is continuous and $y_C^{(N)}(t)$ converges in
probability to $y_C(t)$, we have

\[ \lim_{N \to \infty} M^{(N)}(t) = \lim_{N \to \infty} \mathrm{P}_t\Bigl[\, \frac{1}{N} C_N(t) > y_C(t) \,\Bigr]. \tag{54} \]

Substituting $x = y_C(t)$ in (52), we have

\[ M(t) := \lim_{N \to \infty} M^{(N)}(t) = \lim_{N \to \infty} \mathrm{P}_t\Bigl[\, \frac{1}{N} C_N(t) > y_C(t) \,\Bigr] = \frac{\int_0^\infty e^{-wt}\, w\, \lambda(dw)}{\int_0^\infty w\, \lambda(dw)}. \tag{55} \]

Note that $M(t)$ is independent of the initial configuration $\mu_0^{(N)}$.
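For a discrete jump-rate distribution the limiting miss probability $M(t) = \int_0^\infty e^{-wt}\, w\, \lambda(dw) / \int_0^\infty w\, \lambda(dw)$ is a finite sum, which the following sketch evaluates. The function name `miss_probability` and the arguments `rates`, `probs` are illustrative assumptions, with $\lambda = \sum_a \texttt{probs}[a]\, \delta_{\texttt{rates}[a]}$.

```python
import math


def miss_probability(t, rates, probs):
    """M(t): probability that the item requested at time t has not been requested since time 0."""
    numerator = sum(q * w * math.exp(-w * t) for w, q in zip(rates, probs))
    denominator = sum(q * w for w, q in zip(rates, probs))
    return numerator / denominator


# Two classes of items: half with rate 1 ("unpopular"), half with rate 3 ("popular").
early = miss_probability(0.1, [1.0, 3.0], [0.5, 0.5])
late = miss_probability(2.0, [1.0, 3.0], [0.5, 0.5])
```

As expected, $M(0) = 1$ and $M(t)$ decreases in $t$: the longer the list has been running, the less likely it is that a request hits an item never requested before.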
4.3 Case of generalized Zipf law or Pareto distribution. 

In the preceding subsections, we dealt with formulas for an arbitrary distribution of the jump rates $\lambda$.
In the literature, there are formulas for specific request probabilities, among which the generalized
Zipf law (also known as power-law) is of importance in practical applications. Let $a$ and $b$ be
positive constants and consider the jump rates

\[ w_i = a \Bigl( \frac{N}{i} \Bigr)^{1/b}, \quad i = 1, 2, 3, \dots, N. \tag{56} \]

In applying to move-to-front rules, $a = w_N$ is the smallest jump rate and $b = \dfrac{\log N}{\log (w_1 / w_N)}$ is an exponent
representing the equality of jump rates among the particles.

In [13, 14] we studied the rankings of 2ch.net and amazon.co.jp. 2ch.net is one of the largest
collected posting web pages in Japan. Posting web pages are classified by categories ('boards'), and
each category has a list of topics of posting web pages ('threads'). These lists are updated by the
'last-written-thread-at-the-top' rule. Amazon.co.jp is the Japanese counterpart of amazon.com,
which is a large online bookstore quoted as one of the pioneering 'long-tail' businesses in the era
of internet retail [1]. It shows sales ranks of all the books in its catalog. We have shown
that we can apply the stochastic ranking process with the (generalized) Pareto distribution for the
distribution of jump rates to these social and economic activities, and by performing statistical
fits of the data from these web sites, we extracted the index $b$ in (56). We obtained $b = 0.61$
for 2ch.net and $b = 0.81$ for amazon.co.jp, both indicating $0 < b < 1$, which implies that these
social and economic activities are more 'smash-hit' based than long-tail, in contrast to
the idea in [1]. Values in $0 < b < 1$ have also been found in a study of document access in
the MSNBC commercial news web sites [22] by direct measurements (that is, the distribution $\lambda$ is
directly measurable in the study of [22] and a theory of move-to-front rules is unnecessary).

Let us turn to the search cost probabilities. $\lambda$ of (6) is readily calculated:

\[ \lambda([0, w]) = \begin{cases} 0, & 0 \le w < a, \\ 1 - \Bigl( \dfrac{a}{w} \Bigr)^b, & w \ge a. \end{cases} \tag{57} \]

The continuous distribution $\lambda$ determined by (57) is called the (generalized) Pareto distribution
[21] (or log-linear distribution), especially in social studies, and is used to explain various social
distributions, typically that of incomes.
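The passage from the Zipf-type jump rates (56) to a Pareto limit can be illustrated numerically. The sketch below is only an illustration: the function names are made up here, and the limit law is taken to be the Pareto distribution with $\lambda([0, w]) = 1 - (a/w)^b$ for $w \ge a$, which is the assumption being checked. It compares the empirical distribution $\lambda^{(N)}$ of the rates $w_i = a (N/i)^{1/b}$ with this law.

```python
def zipf_rates(N, a, b):
    """Jump rates w_i = a * (N / i) ** (1 / b), i = 1, ..., N, as in (56)."""
    return [a * (N / i) ** (1.0 / b) for i in range(1, N + 1)]


def empirical_cdf(rates, w):
    """lambda^(N)([0, w]): fraction of items whose jump rate is at most w."""
    return sum(1 for r in rates if r <= w) / len(rates)


def pareto_cdf(w, a, b):
    """Assumed limit law: 0 below a, and 1 - (a / w) ** b for w >= a."""
    return 0.0 if w < a else 1.0 - (a / w) ** b


N, a, b = 10000, 2.0, 0.8
rates = zipf_rates(N, a, b)
gaps = [abs(empirical_cdf(rates, w) - pareto_cdf(w, a, b)) for w in (2.5, 5.0, 50.0)]
```

Since $w_i \le w$ is equivalent to $i \ge N (a/w)^b$, the empirical and limiting distribution functions differ by at most about $1/N$, which is what the entries of `gaps` measure.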

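With (38) we have $w(x) = a x^{-1/b}$, and the denominator in the right hand side of (37) is

\[ \int_0^{w(x)} w\, \lambda(dw) = \frac{ab}{1 - b} \bigl( x^{1 - 1/b} - 1 \bigr). \tag{58} \]

For the numerator of (37) we have

\[ \int_a^\infty e^{-w t_0(x)}\, w\, \lambda(dw) = \int_a^\infty e^{-w t_0(x)}\, b \Bigl( \frac{a}{w} \Bigr)^b dw = \frac{b}{t_0(x)}\, (a t_0(x))^b\, \Gamma(1 - b, a t_0(x)), \tag{59} \]

where $\Gamma(z, p) = \int_p^\infty e^{-w} w^{z-1}\, dw$ is the incomplete Gamma function. To evaluate this, we recall
(13) and perform integration by parts, to find

\[ 1 - x = \int_0^\infty e^{-w t_0(x)}\, \lambda(dw) = e^{-a t_0(x)} - (a t_0(x))^b\, \Gamma(1 - b, a t_0(x)). \tag{60} \]

Substituting (58), (59), and (60) in (37), we have

\[ \lim_{N \to \infty} \frac{\mathrm{P}_\infty[\, \frac{1}{N} C_N > x \,]}{\mathrm{P}[\, \frac{1}{N} R_N > x \,]} = \frac{(1 - b) \bigl( e^{-a t_0(x)} - 1 + x \bigr)}{a t_0(x) \bigl( x^{1 - 1/b} - 1 \bigr)}. \tag{61} \]

This formula is valid for all $b > 0$ and $0 < x < 1$.
Concerning the condition (11), we see from (57),

\[ \int_0^\infty w\, \lambda(dw) = \begin{cases} \dfrac{ab}{b - 1}, & b > 1, \\ \infty, & 0 < b \le 1, \end{cases} \tag{62} \]

so that (11) is equivalent to $b > 1$ for the Pareto distribution. Hence, as discussed in Section 3.3,
(39) holds if $b > 1$. In contrast, if $0 < b < 1$, then noting $\lim_{x \to 0} t_0(x) = 0$ (which is seen from the
definition (12)), we have $\lim_{x \to 0} \Gamma(1 - b, a t_0(x)) = \Gamma(1 - b)$, and (60) implies

\[ a t_0(x) \sim \Bigl( \frac{x}{\Gamma(1 - b)} \Bigr)^{1/b}, \quad x \to 0, \quad \text{if } 0 < b < 1, \]

and (61) then implies

\[ \lim_{x \to +0} \lim_{N \to \infty} \frac{\mathrm{P}_\infty[\, \frac{1}{N} C_N > x \,]}{\mathrm{P}[\, \frac{1}{N} R_N > x \,]} = (1 - b)\, \Gamma(1 - b)^{1/b}. \tag{63} \]
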

The quantity in the right hand side of this result is obtained in [16, Theorem 3]. Note that the
reference formulates the $N = \infty$ case from the beginning (in our notation, this is attained by letting $a$
be proportional to $N^{-1/b}$ in (56)), and a limit $n \to \infty$ is taken in Theorem 3 of [16]. We begin
with $N \to \infty$, fixing $x$, and then take the $x \to 0$ limit in (63). Rigorously speaking, these are different
limits and (63) is a new result. However, since our $x$ and $n$ in [16] are related by $n = Nx$ when
$N < \infty$, both results are consistently talking about 'large $N$, large $n$, and small $x$' for $0 < b < 1$.
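Concerning (34), a general formula for the expectation of the search cost, we have

\[ \lim_{N \to \infty} \mathrm{E}_\infty\Bigl[\, f\Bigl(\frac{1}{N} C_N\Bigr) \Bigr] = (b - 1) \int_0^\infty (at)^{b-2}\, a\, f(y_C(t))\, \Gamma(2 - b, at)\, dt, \tag{64} \]

where (8) implies

\[ y_C(t) = 1 - b (at)^b\, \Gamma(-b, at). \tag{65} \]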



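Noting that

\[ \frac{dy_C}{dt}(t) = ab (at)^{b-1}\, \Gamma(1 - b, at) \]

and an integration by parts formula

\[ \Gamma(z + 1, p) = e^{-p} p^z + z\, \Gamma(z, p) \]

for the incomplete gamma function, we have another expression

\[ \lim_{N \to \infty} \mathrm{E}_\infty\Bigl[\, f\Bigl(\frac{1}{N} C_N\Bigr) \Bigr] = (b - 1) \int_0^\infty f(y_C(t)) \Bigl( e^{-at} - \frac{b - 1}{ab}\, \frac{dy_C}{dt}(t) \Bigr) \frac{dt}{t}. \tag{66} \]

It seems, however, difficult to simplify the formula for general $f$.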

Concerning the miss probability $M(t)$ of (55), the denominator is finite if $b > 1$, and we have,
after an integration by parts,

\[ M(t) = (b - 1) (at)^{b-1} \int_{at}^\infty e^{-x} x^{-b}\, dx = e^{-at} - (at)^{b-1}\, \Gamma(2 - b, at). \]

For $1 < b < 2$ this implies

\[ M(t) = 1 - \Gamma(2 - b) (at)^{b-1} + O(at), \quad t \to 0. \tag{67} \]

In [3] web caching is studied, in which the hit-ratio for the $R$-th request is defined, in our
notation, by

\[ H^{(N)}(R) = 1 - M^{(N)}(\sigma^{(N)}(R)), \]

where $M^{(N)}(t)$ is defined in (53) and $\sigma^{(N)}(k)$ in (2). With (67) and properties of $\sigma^{(N)}$ (see (4)),
together with the law of large numbers, we see that $H(R) = \lim_{N \to \infty} H^{(N)}(NR)$ scales as $R^{b-1}$. This is
consistent with the argument in [3] which claims $H(R) \propto R^{b-1}$ for $1 \ll R \ll N$. The reference
further obtains $1/b = 0.83$ to $0.90$ ($b = 1.11$ to $1.20$) using actual web data.